Statistical file matching of flow cytometry data

Authors

  • Gyemin Lee
  • William Finn
  • Clayton Scott
Abstract

Flow cytometry is a technology that rapidly measures antigen-based markers associated with cells in a cell population. Although analysis of flow cytometry data has traditionally considered one or two markers at a time, there has been increasing interest in multidimensional analysis. However, flow cytometers are limited in the number of markers they can jointly observe, which is typically a fraction of the number of markers of interest. For this reason, practitioners often perform multiple assays based on different, overlapping combinations of markers. In this paper, we address the challenge of imputing the high-dimensional jointly distributed values of marker attributes based on overlapping marginal observations. We show that simple nearest-neighbor imputation can lead to spurious subpopulations in the imputed data, and we introduce an alternative approach based on nearest-neighbor imputation restricted to a cell's subpopulation. This requires us to perform clustering with missing data, which we address with a mixture model approach and a novel EM algorithm. Since mixture model fitting may be ill-posed in this context, we also develop techniques to initialize the EM algorithm using domain knowledge. We demonstrate our approach on real flow cytometry data.
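
The idea of restricting nearest-neighbor imputation to a cell's subpopulation can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes cluster labels are already available for both files (the paper obtains them from a mixture model fitted with EM, which is not reproduced here), it uses Euclidean distance on the shared markers, and the function name `impute_within_cluster` and its arguments are hypothetical.

```python
import numpy as np


def impute_within_cluster(common_a, labels_a, common_b, specific_b, labels_b):
    """Impute file-B-only markers for the cells of file A.

    Each file-A cell receives the file-B-only marker values of its nearest
    file-B cell (Euclidean distance on the shared markers), where donors are
    restricted to file-B cells carrying the same cluster label.
    Cluster labels are assumed to be given; this sketch does not compute them.
    """
    common_a = np.asarray(common_a, dtype=float)
    common_b = np.asarray(common_b, dtype=float)
    specific_b = np.asarray(specific_b, dtype=float)
    labels_a = np.asarray(labels_a)
    labels_b = np.asarray(labels_b)

    # Cells with no same-cluster donor stay NaN.
    imputed = np.full((common_a.shape[0], specific_b.shape[1]), np.nan)

    for k in np.unique(labels_a):
        rows_a = np.flatnonzero(labels_a == k)
        rows_b = np.flatnonzero(labels_b == k)
        if rows_b.size == 0:
            continue
        a = common_a[rows_a]
        b = common_b[rows_b]
        # Squared distances between every A cell and every B donor in cluster k.
        diff = a[:, None, :] - b[None, :, :]
        dist = np.sum(diff * diff, axis=2)
        nearest = rows_b[np.argmin(dist, axis=1)]
        imputed[rows_a] = specific_b[nearest]
    return imputed


# Toy example: one shared marker, one file-B-only marker.
result = impute_within_cluster(
    common_a=[[0.0], [1.0]],
    labels_a=[0, 1],
    common_b=[[0.9], [0.1], [5.0]],
    specific_b=[[10.0], [20.0], [30.0]],
    labels_b=[0, 0, 1],
)
print(result)
```

In this toy example the second file-A cell (shared marker 1.0, cluster 1) takes its value from the only cluster-1 donor, whereas an unrestricted nearest-neighbor search would have picked the cluster-0 donor at 0.9. That cross-subpopulation borrowing is the kind of mismatch the abstract associates with spurious subpopulations.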


Similar resources

A proposal for a flow cytometric data file standard.

The increasing complexity of multiparameter data collection and analysis in flow cytometry and the development of relatively inexpensive arc-lamp-based flow cytometers, which increases the probability that laboratories or institutions may have more than one type of instrument, creates a need for shareable analysis programs and for the transport of flow cytometric data files within an installati...


Data File Standard for Flow Cytometry, version FCS 3.1.

The flow cytometry data file standard provides the specifications needed to completely describe flow cytometry data sets within the confines of the file containing the experimental data. In 1984, the first Flow Cytometry Standard format for data files was adopted as FCS 1.0. This standard was modified in 1990 as FCS 2.0 and again in 1997 as FCS 3.0. We report here on the next generation flow cy...


Proposed new data file standard for flow cytometry, version FCS 3.0.

In 1984, the first flow cytometry data file format was proposed as Flow Cytometry Standard 1.0 (FCS1.0). FCS 1.0 provided a uniform file format allowing data acquired on one computer to be correctly read and interpreted on other computers running a variety of operating systems. That standard was modified in 1990 and adopted by the Society of Analytical Cytology as FCS 2.0. Here, we report on an...


An Empirical Study of Cluster Evaluation Metrics using Flow Cytometry Data

A wide range of abstract characteristics of partitions have been proposed for cluster evaluation. We empirically evaluated the performance of these metrics for flow cytometry data and found that the set-matching metrics perform closest to human.



Journal:
  • Journal of biomedical informatics

Volume 44, Issue 4

Publication date: 2011